Abstract: We consider the Additive White Gaussian Noise 
channel with Binary Phase Shift Keying modulation. Our aim is 
to enable an algebraic hard decision Bounded Minimum Distance 
decoder for a binary block code to exploit soft information 
obtained from the demodulator. This idea goes back to Forney [1], 
[2] and is based on treating received symbols with low reliability 
as erasures. This erasing at the decoder is done using a threshold: 
each received symbol whose reliability falls below the threshold is 
erased. Depending on the target overall complexity of the decoder, 
this pseudo-soft decision decoding can be extended from one 
threshold $T$ to $z > 1$ thresholds $T_1 < \cdots < T_z$ for erasing 
received symbols with lowest reliability. The resulting technique 
is widely known as Generalized Minimum Distance decoding. 
In this paper we provide a means for explicit determination of 
the optimal threshold locations in terms of minimal decoding 
error probability. We do this for the one and the general z > 1 
thresholds case, starting with a geometric interpretation of the 
optimal threshold location problem and using an approach from 
[3]. 

I. Introduction 

The concept of concatenated codes was introduced by 
Forney in 1966 [1]. Concatenated codes consist of an inner and 
an outer code; a decoder for the concatenated code includes 
their associated decoders. Encoding is done such that the 
information block to be transmitted is first encoded using the outer 
code and then the symbols of the resulting outer codeword 
are encoded using the inner code. At the receiver side first the 
decoder for the inner code calculates estimates for the outer 
codeword symbols. Then, the decoder for the outer code tries 
to reconstruct the transmitted codeword utilizing the estimates 
from the inner decoder as inputs. In his original work, Forney 
proposed Generalized Minimum Distance (GMD) decoding, 
which extends simple single-trial decoding of concatenated 
codes to multiple decoding trials. More precisely, Forney 
specified GMD decoding for an integer number $z > (d-1)/2$ of 
decoding trials, where $d$ is the minimum Hamming distance 
of the outer code. For smaller values of $z$, Weber and 
Abdel-Ghaffar later introduced the term reduced GMD decoding [4]. 
GMD decoding relies on an outer error/erasure decoder and 
works as follows. In each decoding trial, an increasing set of 
most unreliable symbols obtained from the inner decoder are 
erased. The resulting intermediate word is fed into the outer 
error/erasure decoder, which calculates an outer codeword 
estimate. Potentially, each decoding trial results in a different 
outer codeword estimate, so some means of selecting the "best" 
estimate needs to be provided. 

Let the number of performed decoding trials be z. We do not 
distinguish between reduced GMD decoding and full GMD 
decoding and allow z to be any non-zero natural number 
independent of the code parameters. In practice, erasing of the 
most unreliable symbols is accomplished using a set of 
real-valued thresholds $\{T_1, \ldots, T_z\}$ with $T_1 < \cdots < T_z$. If the 
reliability value of a symbol falls below threshold $T_i$ in the $i$-th 
decoding trial, then this symbol is marked as an erasure in this 
trial. The threshold version of GMD decoding was presented 
by Blokh and Zyablov [6]. 

In this paper we consider a special case of a code 
concatenation, i.e. the case where the inner "code" is Binary 
Phase Shift Keying (BPSK) modulation and the outer code is 
a linear binary code with an error/erasure Bounded Minimum 
Distance (BMD) decoder. Such decoders are well-known 
for certain important code classes, e.g. for 
Bose-Chaudhuri-Hocquenghem (BCH) codes [5]. 

Our work is organized as follows. In Section II we give 
basic definitions and notations that are used in the remainder 
of the paper. Section III considers the simplest case of 
(reduced) GMD decoding, i.e. error/erasure BMD decoding 
with one single threshold. Its optimal location is derived using 
a geometric approach. Note that we use "optimal" as an 
abbreviation for "optimal in terms of minimal decoding error 
probability". In Section IV we consider the general case of 
$z > 1$ thresholds before we finally wrap up the paper with 
conclusions and further research perspectives in Section V. 

II. Definitions and Notations 
Assume an Additive White Gaussian Noise (AWGN) channel 
with BPSK modulation. Let the transmitted symbols be w.l.o.g. 
$x \in \{-1, +1\}$, i.e. the modulator performs for every 
transmitted binary value $c \in \{0, 1\}$ the operation $x = (-1)^c$ and the 
transmit signal power is fixed to $E_s = 1$. Hence, the standard 
deviation of the AWGN channel is $\sigma = \sqrt{N_0/2}$. We define 
the probability that for given $\sigma$ a transmitted symbol $x$ results 
in a received symbol $y$ within the real interval $[a, b]$ as 

$$p_\sigma(a, b) := \frac{1}{\sqrt{2\pi}\,\sigma} \int_a^b \exp\left(-\frac{(y - x)^2}{2\sigma^2}\right) \mathrm{d}y.$$

For simplicity we also define the negative logarithmic probability 

$$l_\sigma(a, b) := -\ln(p_\sigma(a, b)).$$
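The two quantities above can be evaluated numerically with the complementary error function. The following is a minimal sketch, assuming that $p_\sigma(a,b)$ is the Gaussian interval probability with mean $x$ and that $x = +1$ by default (the case used from Section III on); the function names `p_sigma` and `l_sigma` are illustrative choices, not part of the original text.

```python
import math

def p_sigma(a, b, sigma, x=1.0):
    """P(a <= y <= b) for y = x + noise, noise ~ N(0, sigma^2) (assumed model)."""
    def cdf(v):
        # Gaussian CDF with mean x; erfc keeps small lower-tail values accurate.
        return 0.5 * math.erfc((x - v) / (sigma * math.sqrt(2.0)))
    return cdf(b) - cdf(a)

def l_sigma(a, b, sigma, x=1.0):
    """Negative logarithmic probability -ln(p_sigma(a, b))."""
    return -math.log(p_sigma(a, b, sigma, x))
```

Infinite interval limits can be passed as `math.inf` and `-math.inf`.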

As outer code we assume a linear binary $(n, k, d)$ code 
$\mathcal{C}$ with code length $n$, dimension $k$ and minimum Hamming 
distance $d$. An error/erasure BMD decoder for $\mathcal{C}$ can decode 
error patterns with $\tau$ erasures and $\varepsilon$ errors as long as 

$$2\varepsilon + \tau < d. \qquad (1)$$

A codeword $c = (c_0, \ldots, c_{n-1}) \in \mathcal{C}$ is mapped to a 
vector $x = (x_0, \ldots, x_{n-1}) \in \{-1, +1\}^n$ by the modulation 
function described above. At the receiver side, the vector 
$y = (y_0, \ldots, y_{n-1}) \in \mathbb{R}^n$ is received. For each received 
symbol, $y_j = x_j + \xi$ holds, where $\xi$ is the realization of a 
zero-mean Gaussian noise process with standard deviation $\sigma$, 
so that $y_j$ is Gaussian with mean $x_j$. 

III. The Single Threshold Case 

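We start our considerations with the case of one single 
threshold $0 < T < 1 = E_s$. This means that the following 
quantization-and-erasing function is applied to any received 
symbol $y_j$: 

$$\phi_T : \mathbb{R} \to \{0, 1\} \cup \{\mathrm{X}\}, \qquad \phi_T(y_j) := \begin{cases} 1 & \text{if } y_j \le -T, \\ 0 & \text{if } y_j \ge T, \\ \mathrm{X} & \text{if } -T < y_j < T. \end{cases}$$

The obvious extension of $\phi_T$ to vectors is 

$$\phi_T(y) := (\phi_T(y_0), \ldots, \phi_T(y_{n-1})).$$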
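The quantization-and-erasing step can be sketched in a few lines. This is an illustrative sketch only: the names `phi`, `phi_vec` and the string `"X"` for the erasure symbol are assumptions, and the boundary cases follow the closed intervals used for the symbol counts in Section IV.

```python
ERASURE = "X"  # hypothetical marker for an erased position

def phi(y_j, T):
    """Map one received symbol to 1, 0 or an erasure, given threshold T."""
    if y_j <= -T:
        return 1        # reliably negative: binary value 1 (x = -1)
    if y_j >= T:
        return 0        # reliably positive: binary value 0 (x = +1)
    return ERASURE      # |y_j| < T: too unreliable, erase

def phi_vec(y, T):
    """Component-wise extension of phi to a received vector."""
    return [phi(y_j, T) for y_j in y]
```

For example, `phi_vec([0.9, -0.8, 0.1, -0.2], 0.3)` keeps the first two symbols as hard decisions and erases the last two.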

Note that since $\mathcal{C}$ is a linear code and the threshold location 
is symmetric, we can restrict our considerations in the 
following w.l.o.g. to the case $\forall j = 0, \ldots, n-1 : x_j = +1$, i.e. 
transmission of the all-zero codeword. 

Consider the probability $P_\sigma$ that the decoder produces an 
error, i.e. the probability that it either returns no codeword or 
a wrong codeword. We make use of the abbreviated notation 
$p_x := p_\sigma(-T, T)$ and $p_e := p_\sigma(-\infty, -T)$ for the erasure and 
error probability, respectively. Similarly, we define the negative 
logarithms $l_x := -\ln(p_x)$ and $l_e := -\ln(p_e)$. 

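$$P_\sigma = \sum_{\tau=0}^{n} \sum_{\varepsilon=\varepsilon_\tau}^{n-\tau} \binom{n}{\tau,\, \varepsilon,\, n-\tau-\varepsilon} p_x^{\tau}\, p_e^{\varepsilon}\, (1 - p_x - p_e)^{n-\tau-\varepsilon}, \qquad (2)$$

where $\varepsilon_\tau := \lceil (d-\tau)/2 \rceil$. For good channel conditions, i.e. small 
values of $\sigma$, we obtain the approximation 

$$P_\sigma \approx \max_{0 \le \tau \le d} \left\{ \binom{n}{\tau,\, \varepsilon_\tau,\, n-\tau-\varepsilon_\tau} p_x^{\tau}\, p_e^{\varepsilon_\tau} \right\}.$$

Note that the last term in (2) can be neglected since it is 
close to one. Transforming this approximation into negative 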



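logarithmic form we obtain 

$$-\ln(P_\sigma) \approx \min_{0 \le \tau \le d} \left\{ \tau\, l_x + \varepsilon_\tau\, l_e - \ln(2) \left[ n H\!\left(\frac{\tau}{n}\right) + (n-\tau) H\!\left(\frac{\varepsilon_\tau}{n-\tau}\right) \right] \right\},$$

where $H(\cdot)$ denotes the binary entropy function. Since it only 
assumes values between 0 and 1 and $l_x$ and $l_e$ tend to infinity 
for small $\sigma$, we can further approximate 

$$-\ln(P_\sigma) \approx \min_{0 \le \tau \le d} \{\tau\, l_x + \varepsilon_\tau\, l_e\}. \qquad (3)$$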
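For small code lengths the double sum for the exact decoding error probability and the dominant-term approximation of its negative logarithm can be evaluated directly. The sketch below assumes the Gaussian interval probability with transmitted symbol $x = +1$; the function names are illustrative, and the clamp of the lower summation limit to zero for $\tau > d$ is an added safeguard, not part of the formula as stated.

```python
import math

def p_sigma(a, b, sigma):
    # Interval probability for y ~ N(+1, sigma^2) (all-zero codeword assumed).
    cdf = lambda v: 0.5 * math.erfc((1.0 - v) / (sigma * math.sqrt(2.0)))
    return cdf(b) - cdf(a)

def exact_error_prob(n, d, T, sigma):
    """Sum over all erasure/error counts with 2*eps + tau >= d."""
    p_x = p_sigma(-T, T, sigma)            # erasure probability
    p_e = p_sigma(-math.inf, -T, sigma)    # error probability
    p_ok = 1.0 - p_x - p_e                 # correct hard decision
    total = 0.0
    for tau in range(n + 1):
        eps_min = max(0, math.ceil((d - tau) / 2))  # clamp is an assumption
        for eps in range(eps_min, n - tau + 1):
            coeff = math.comb(n, tau) * math.comb(n - tau, eps)  # multinomial
            total += coeff * p_x**tau * p_e**eps * p_ok**(n - tau - eps)
    return total

def approx_neg_log(d, T, sigma):
    """min over tau of tau*l_x + ceil((d - tau)/2)*l_e."""
    l_x = -math.log(p_sigma(-T, T, sigma))
    l_e = -math.log(p_sigma(-math.inf, -T, sigma))
    return min(tau * l_x + math.ceil((d - tau) / 2) * l_e for tau in range(d + 1))
```

Comparing `-math.log(exact_error_prob(n, d, T, sigma))` with `approx_neg_log(d, T, sigma)` for decreasing `sigma` gives a feel for how quickly the dominant term takes over.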



Now we return to the non-abbreviated notation and define 
the goal function 

$$g_\sigma(\tau, T) := \tau\, l_\sigma(-T, T) + \frac{d-\tau}{2}\, l_\sigma(-\infty, -T). \qquad (4)$$

We omit the ceiling operation from $\varepsilon_\tau$ to obtain a function 
which is linear in $\tau$. By means of (3) we observe that 
the minimum of the goal function over $\tau$ approximates the 
negative logarithmic decoding error probability as long as the 
channel standard deviation $\sigma$ is small. The behavior of the 
goal function for several thresholds is depicted in Figure 1. 
The number of erasures $\tau$ is plotted on the abscissa and each 
straight line represents one threshold $0 < T < 1 = E_s$. The 
minimum of each straight line over $0 \le \tau \le d$ represents the approximated 
negative logarithmic error probability for this specific 
threshold. The decoder's aim is to select the threshold $T$ such that the 
minimum is maximized since this yields the minimal decoding 
error probability. 
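Because the goal function is linear in the number of erasures, its minimum over $0 \le \tau \le d$ is always attained at $\tau = 0$ or $\tau = d$. The following sketch illustrates this, again assuming the Gaussian interval probability with $x = +1$; the helper names are hypothetical.

```python
import math

def l_sigma(a, b, sigma):
    # -ln of the interval probability for y ~ N(+1, sigma^2) (assumed model).
    cdf = lambda v: 0.5 * math.erfc((1.0 - v) / (sigma * math.sqrt(2.0)))
    return -math.log(cdf(b) - cdf(a))

def goal(tau, T, sigma, d):
    """Goal function: tau * l(-T, T) + (d - tau)/2 * l(-inf, -T)."""
    return tau * l_sigma(-T, T, sigma) + 0.5 * (d - tau) * l_sigma(-math.inf, -T, sigma)

def approx_neg_log_error(T, sigma, d):
    """Minimum of the linear goal function, taken at an end point."""
    return min(goal(0, T, sigma, d), goal(d, T, sigma, d))
```

Evaluating `approx_neg_log_error` on a grid of thresholds and picking the largest value mimics the selection rule described above.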

The following theorem provides a necessary and sufficient 
criterion for the optimal high-SNR threshold $T_\sigma$. 

Theorem 1 For good channel conditions, i.e. small channel 
standard deviation $\sigma$, $T_\sigma$ is the optimal threshold if and only 
if the following equation is fulfilled. 

$$\sqrt{p_\sigma(-\infty, -T_\sigma)} = p_\sigma(-T_\sigma, T_\sigma). \qquad (5)$$



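Proof: Since the goal function is linear in $\tau$, it assumes 
its minimum at one of the two extremal points $g_\sigma(0, T)$ and 
$g_\sigma(d, T)$, which means that (3) reduces to 

$$-\ln(P_\sigma) \approx \min\{g_\sigma(0, T),\, g_\sigma(d, T)\}.$$

Let $T_\sigma$ be such that $g_\sigma(0, T_\sigma) = g_\sigma(d, T_\sigma)$. Inserting the 
definition of the goal function shows that this is equivalent 
to 

$$p_\sigma(-\infty, -T_\sigma)^{d/2} = p_\sigma(-T_\sigma, T_\sigma)^{d}, \qquad (6)$$

which is (5) after taking the $d$-th root of both sides. 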

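Assume that threshold $T' \ne T_\sigma$ is optimal. This gives 

$$p_\sigma(-T', T') = p_\sigma(-T_\sigma, T_\sigma) + \Delta \quad \text{and} \quad p_\sigma(-\infty, -T') = p_\sigma(-\infty, -T_\sigma) - \Delta,$$

where $\Delta > 0$ if $T' > T_\sigma$ and $\Delta < 0$ if $T' < T_\sigma$ since both 

$$p_\sigma(-\infty, -T_\sigma) + p_\sigma(-T_\sigma, T_\sigma) + p_\sigma(T_\sigma, \infty) = 1$$

and 

$$p_\sigma(-\infty, -T') + p_\sigma(-T', T') + p_\sigma(T', \infty) = 1$$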



must be fulfilled. If we transform (3) back to the 
non-logarithmic domain we obtain 

$$P_\sigma \approx \max\{p_\sigma(-\infty, -T)^{d/2},\; p_\sigma(-T, T)^{d}\}. \qquad (7)$$

By using threshold $T'$, we increase one of the two expressions 
in (7) and thereby also the maximum of both expressions. But 
this means that the decoding error probability is increased and 
thus $T' \ne T_\sigma$ cannot be the optimal threshold. Hence, $T_\sigma$ is 
optimal and the statement of the theorem is proved. ■ 
Theorem 1 allows for the following geometric interpretation. 
The optimal threshold $T_\sigma$ is the specific threshold for which 
the goal function is a perfectly horizontal line in Figure 1. 

Fig. 1. Four exemplary instances of the goal function for $\sigma = 0.4$ and 
$d = 31$. The minimum of each instance represents the negative logarithmic 
decoding error probability achievable with the specific threshold. 

Figure 2 shows in the upper curve the optimal high-SNR 
threshold $T_\sigma$ for SNR values between 0 and 20 dB. The plot 
was obtained by numerically solving equation (5). Each point 
on the curve represents the optimal threshold for the specific 
SNR value, i.e. the threshold for which the goal function 
is independent of $\tau$ and thereby a perfect horizontal line in 
Figure 1. 
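One way to solve the threshold equation of Theorem 1 numerically is bisection on $[0, 1]$, since the error probability decreases and the erasure probability increases with the threshold. The text does not say which numerical method was used, so the sketch below is only one possible choice; it assumes the Gaussian interval probability with $x = +1$ and the SNR convention $\mathrm{SNR} = E_s/N_0$ with $E_s = 1$, and all function names are hypothetical.

```python
import math

def p_sigma(a, b, sigma):
    # Interval probability for y ~ N(+1, sigma^2) (assumed model).
    cdf = lambda v: 0.5 * math.erfc((1.0 - v) / (sigma * math.sqrt(2.0)))
    return cdf(b) - cdf(a)

def balance(T, sigma):
    """sqrt(p(-inf, -T)) - p(-T, T); zero at the threshold of Theorem 1."""
    return math.sqrt(p_sigma(-math.inf, -T, sigma)) - p_sigma(-T, T, sigma)

def optimal_threshold(sigma, iterations=100):
    """Bisection for the root of balance(T, sigma) on [0, 1]."""
    lo, hi = 0.0, 1.0
    if not (balance(lo, sigma) > 0.0 > balance(hi, sigma)):
        raise ValueError("no sign change on [0, 1] for this sigma")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if balance(mid, sigma) > 0.0:
            lo = mid   # error term still dominates: move threshold up
        else:
            hi = mid   # erasure term dominates: move threshold down
    return 0.5 * (lo + hi)

def sigma_from_snr_db(snr_db):
    # Assumes SNR = E_s / N_0 with E_s = 1, so sigma = sqrt(N_0 / 2).
    return math.sqrt(0.5 * 10.0 ** (-snr_db / 10.0))
```

A curve like the upper one in Figure 2 could then be traced with `[optimal_threshold(sigma_from_snr_db(s)) for s in range(0, 21)]`.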

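Obtaining an analytic solution for equation (5) is non-trivial 
since it essentially means solving 

$$\left( \mathrm{Erf}\!\left(\frac{T-1}{\sqrt{2}\sigma}\right) + \mathrm{Erf}\!\left(\frac{T+1}{\sqrt{2}\sigma}\right) \right)^{2} = 2\, \mathrm{Erfc}\!\left(\frac{T+1}{\sqrt{2}\sigma}\right),$$

where 

$$\mathrm{Erf}(a) := \frac{2}{\sqrt{\pi}} \int_0^a e^{-x^2}\, \mathrm{d}x$$

is the error function and $\mathrm{Erfc}(a) := 1 - \mathrm{Erf}(a)$ is its 
complementary counterpart. However, using the well-known 
approximation 

$$\mathrm{Erfc}(a) \approx \frac{e^{-a^2}}{\sqrt{\pi}\, a}$$

from [1] which is good for $a > 1$ we can at least for good 
channel conditions (i.e. small standard deviation $\sigma$) obtain the 
analytic solution 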



T a := 3 



3a 2 - 



(8) 

Fig. 2. Optimal threshold location $T_\sigma$ for SNR values between 0 and 20 dB, 
$\sigma = \sqrt{\tfrac{1}{2}\, 10^{-\mathrm{SNR}/10}}$. The upper curve is the numerically calculated optimal 
high-SNR threshold given by Theorem 1 and the middle curve is the analytic 
high-SNR threshold from (8). The lower curve is the general optimal threshold 
for the full SNR range and was obtained by numerically minimizing (2) for 
a binary code with length 127 and minimum distance 63. 



which approximates the optimal high-SNR threshold location 
for given $\sigma$. Figure 2 compares the numerical and the 
analytical optimal high-SNR threshold locations with the general 
optimal threshold. Note that the analytic approximation is only 
valid for high SNR values. This poses no problem since the 
numerically calculated threshold given by Theorem 1 is also 
only valid in the high SNR regime. 

We can utilize the analytic optimal threshold location to 
show the gain of single-threshold error/erasure BMD decoding 
over errors-only decoding for good channel conditions. If 
the optimal threshold is used, (7) allows us to approximate the 
decoding error probability by 

$$P_\sigma \approx p_\sigma(-\infty, -T_\sigma)^{d/2}.$$

It is further well-known that the error probability of 
errors-only BMD decoding can be approximated by 

$$P_{\sigma, \mathrm{BMD}} \approx p_\sigma(-\infty, 0)^{d/2}.$$

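Now we let $\sigma \to 0$. From (8) we get $T_\sigma = 3 - 2\sqrt{2}$. Denoting by 
$\sigma_1$ and $\sigma_2$ the channel standard deviations for error/erasure and 
errors-only decoding, respectively, we can then solve 

$$p_{\sigma_1}(-\infty, -3 + 2\sqrt{2})^{d/2} = p_{\sigma_2}(-\infty, 0)^{d/2} \iff \mathrm{Erfc}\!\left(\frac{2(\sqrt{2}-1)}{\sigma_1}\right) = \mathrm{Erfc}\!\left(\frac{1}{\sqrt{2}\sigma_2}\right) \iff \sigma_1 = 2\sqrt{2}(\sqrt{2}-1)\, \sigma_2$$

to see that the gain is $20 \log_{10}(2\sqrt{2}(\sqrt{2}-1)) \approx 1.4$ dB. 
This is in line with results obtained in the original works by 
Forney [1], [2]. 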
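The arithmetic behind the 1.4 dB figure can be checked in a few lines. The sketch below only restates the numbers from the text: the high-SNR threshold $3 - 2\sqrt{2}$ balances the Gaussian exponents, $(1+T)^2 = 2(1-T)^2$, and the admissible increase in standard deviation is the factor $1 + T = 2\sqrt{2}(\sqrt{2}-1)$. Variable names are illustrative.

```python
import math

# High-SNR threshold quoted in the text.
T_limit = 3.0 - 2.0 * math.sqrt(2.0)

# Exponent balance behind it: error exponent (1+T)^2 equals twice the
# erasure exponent (1-T)^2, mirroring p_e = p_x^2.
exponent_gap = (1.0 + T_limit) ** 2 - 2.0 * (1.0 - T_limit) ** 2

# Ratio of tolerable noise standard deviations and the resulting gain in dB.
sigma_ratio = 1.0 + T_limit
gain_db = 20.0 * math.log10(sigma_ratio)
```

Here `gain_db` evaluates to roughly 1.375, which rounds to the 1.4 dB stated above.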

IV. The General z Thresholds Case 

We advance to the general case, where $z > 1$ thresholds 
are used to determine which of the received symbols are 
considered as unreliable and thus are erased. The situation 
is depicted in Figure 3. We consider a set of $z$ thresholds 
$\mathcal{T} := \{T_1, \ldots, T_z\}$ fulfilling $0 < T_1 < \cdots < T_z < 1 = E_s$, 
and $z$ trials of error/erasure decoding for the received vector 
$y$ are performed: the first one with decoder input $\phi_{T_1}(y)$, the 
second one with decoder input $\phi_{T_2}(y)$ and so on, where the 
quantization-and-erasing function is 

$$\phi_{T_i} : \mathbb{R} \to \{0, 1\} \cup \{\mathrm{X}\}, \qquad \phi_{T_i}(y_j) := \begin{cases} 1 & \text{if } y_j \le -T_i, \\ 0 & \text{if } y_j \ge T_i, \\ \mathrm{X} & \text{if } -T_i < y_j < T_i. \end{cases}$$



The result of this approach can obviously be a list of code- 
words. In our simplified setting, where the inner code is 
BPSK modulation, the selection of the best guess from this 
result list is straightforward - it can be realized by applying 
the modulation operation to the binary symbols of all result 
list entries and choosing the one with the smallest Euclidean 
distance to the received vector $y$. In the $z > 1$ thresholds 
case we denote the event that none of the list entries is the 
originally transmitted codeword or that the list is empty as 
decoding error with probability $P_\sigma$. 
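The selection of the best guess from the result list can be sketched as follows. This is a minimal illustration, assuming the candidates are given as lists of bits and that the list may be empty; the names `modulate` and `best_candidate` are hypothetical.

```python
def modulate(c):
    """BPSK mapping x = (-1)^c applied to every bit of a binary word."""
    return [1.0 - 2.0 * bit for bit in c]

def best_candidate(candidates, y):
    """Return the candidate closest to y in Euclidean distance, or None."""
    if not candidates:
        return None  # empty result list: no trial produced a codeword
    def sq_dist(c):
        # Squared distance has the same minimizer as the distance itself.
        return sum((y_j - x_j) ** 2 for y_j, x_j in zip(y, modulate(c)))
    return min(candidates, key=sq_dist)
```

For instance, with `y = [0.9, -0.2, 1.1]` the word `[0, 1, 0]` is preferred over `[0, 0, 0]` because the second received symbol is negative.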




Fig. 3. Sketch of the threshold locations depicting the possible erasure 
intervals depending on thresholds $0 < T_1 < \cdots < T_z < 1 = E_s$. 



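In support of a dense notation we define the following 
abbreviated probabilities and their negative logarithmic 
counterparts. Underlined quantities refer to intervals on the negative 
side, overlined quantities to intervals on the positive side. 

$p_l := p_\sigma(-\infty, -T_z)$ and $l_l := -\ln(p_l)$, 
$p_c := p_\sigma(-T_1, T_1)$ and $l_c := -\ln(p_c)$, 
$p_r := p_\sigma(T_z, \infty)$ and $l_r := -\ln(p_r)$, 
$\underline{p}_i := p_\sigma(-T_{i+1}, -T_i)$ and $\underline{l}_i := -\ln(\underline{p}_i)$, 
$\overline{p}_i := p_\sigma(T_i, T_{i+1})$ and $\overline{l}_i := -\ln(\overline{p}_i)$, 

where $i = 1, \ldots, z-1$. We also define the numbers of symbols 
within the received vector $y$ that fall into the specific intervals. 

$t_l$ := number of received symbols within $(-\infty, -T_z]$, 
$t_c$ := number of received symbols within $(-T_1, T_1)$, 
$t_r$ := number of received symbols within $[T_z, \infty)$, 
$\underline{t}_i$ := number of received symbols within $(-T_{i+1}, -T_i]$, 
$\overline{t}_i$ := number of received symbols within $[T_i, T_{i+1})$, 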

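where again $i = 1, \ldots, z-1$. Some intervals and their 
corresponding abbreviated probability and number of symbols 
are depicted in Figure 3. With the previous definitions, the 
decoding error probability can be stated explicitly by 

$$P_\sigma = \sum_{C} \binom{n}{t_l,\, t_c,\, t_r,\, \underline{t}_1,\, \overline{t}_1,\, \ldots,\, \underline{t}_{z-1},\, \overline{t}_{z-1}} p_l^{t_l}\, p_c^{t_c}\, p_r^{t_r} \prod_{i=1}^{z-1} \underline{p}_i^{\underline{t}_i}\, \overline{p}_i^{\overline{t}_i}, \qquad (9)$$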

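where the sum is over all non-negative integers satisfying the 
two conditions 

$$C := \begin{cases} t_l + t_c + t_r + \sum_{i=1}^{z-1} (\underline{t}_i + \overline{t}_i) = n \text{ and} \\ \forall i = 1, \ldots, z : \; 2\left(t_l + \sum_{\nu=i}^{z-1} \underline{t}_\nu\right) + t_c + \sum_{\nu=1}^{i-1} (\underline{t}_\nu + \overline{t}_\nu) \ge d. \end{cases}$$

The first condition in $C$ is obvious: it simply states that 
the total number of received symbols must equal the code 
length $n$. The second condition represents a decoding error 
for error/erasure BMD decoding of all input vectors $\phi_{T_i}(y)$, 
$i = 1, \ldots, z$. In this case, the number of errors for threshold 
$T_i$ is $\varepsilon_{T_i} = t_l + \sum_{\nu=i}^{z-1} \underline{t}_\nu$ and the number of erasures is 
$\tau_{T_i} = t_c + \sum_{\nu=1}^{i-1} (\underline{t}_\nu + \overline{t}_\nu)$, as can be easily seen by means 
of Figure 3. The second condition then follows from (1). 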
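The error and erasure counts per threshold can also be obtained directly from a received vector, which makes the second condition easy to check for a concrete example. The sketch below assumes transmission of the all-zero codeword ($x_j = +1$ for all $j$), so every symbol at or below $-T_i$ counts as an error; function names are illustrative.

```python
def errors_and_erasures(y, thresholds):
    """Return one (errors, erasures) pair per threshold, all-zero codeword assumed."""
    counts = []
    for T in thresholds:
        errors = sum(1 for y_j in y if y_j <= -T)       # wrong hard decisions
        erasures = sum(1 for y_j in y if -T < y_j < T)  # erased symbols
        counts.append((errors, erasures))
    return counts

def all_trials_fail(y, thresholds, d):
    """True if 2*errors + erasures >= d holds for every threshold."""
    return all(2 * e + t >= d for e, t in errors_and_erasures(y, thresholds))
```

With `y = [0.9, -0.5, 0.1, -0.15, 0.25, 1.2, 0.8]` and thresholds `[0.2, 0.4]`, the counts are one error with two erasures and one error with three erasures, respectively.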

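We can obtain an approximation of $P_\sigma$ if we assume that 
the second condition is fulfilled with equality for all thresholds 
$T \in \mathcal{T}$. For $i = 1, \ldots, z-1$ we can then subtract 

$$2\left(t_l + \sum_{\nu=i+1}^{z-1} \underline{t}_\nu\right) + t_c + \sum_{\nu=1}^{i} (\underline{t}_\nu + \overline{t}_\nu) = d$$

from 

$$2\left(t_l + \sum_{\nu=i}^{z-1} \underline{t}_\nu\right) + t_c + \sum_{\nu=1}^{i-1} (\underline{t}_\nu + \overline{t}_\nu) = d$$

and see that 

$$\forall i = 1, \ldots, z-1 : \; \underline{t}_i = \overline{t}_i \qquad (10)$$

holds. Obeying this equality we obtain the new condition 

$$C^* := 2 t_l + t_c + 2 \sum_{i=1}^{z-1} \underline{t}_i = d$$

for the sum in (9). 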

For good channel conditions, i.e. small values of the channel 
standard deviation $\sigma$, the decoding error probability can be 
approximated by 

$$P_\sigma \approx \max_{C^*} \left\{ p_l^{t_l}\, p_c^{t_c} \prod_{i=1}^{z-1} (\underline{p}_i\, \overline{p}_i)^{\underline{t}_i} \right\}. \qquad (11)$$



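The term $p_r^{t_r}$ in (9) can be neglected since for small $\sigma$ it is 
close to one. Furthermore, (10) is used to group the factors 
of the product under a single exponent. By transforming (11) 
into negative logarithmic form we obtain 

$$-\ln(P_\sigma) \approx \min_{C^*} \left\{ t_l l_l + t_c l_c + \sum_{i=1}^{z-1} \underline{t}_i (\underline{l}_i + \overline{l}_i) \right\}, \qquad (12)$$

which contains the goal function 

$$g_\sigma(t_l, t_c, \underline{t}_1, \ldots, \underline{t}_{z-1}, T_1, \ldots, T_z) := t_l l_l + t_c l_c + \sum_{i=1}^{z-1} \underline{t}_i (\underline{l}_i + \overline{l}_i). \qquad (13)$$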

i=l 

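The following theorem, whose proof exploits the linearity 
of the goal function in $t_l, t_c, \underline{t}_1, \ldots, \underline{t}_{z-1}$, provides a necessary 
and sufficient criterion for the optimal set of thresholds $\mathcal{T}_\sigma$. 

Theorem 2 For good channel conditions, i.e. small channel 
standard deviation $\sigma$, $\mathcal{T}_\sigma := \{T_{1,\sigma}, \ldots, T_{z,\sigma}\}$ is the optimal set 
of thresholds if and only if the following system of equations 
is fulfilled. 

$$\sqrt{p_\sigma(-\infty, -T_{z,\sigma})} = p_\sigma(-T_{1,\sigma}, T_{1,\sigma}),$$

$$p_\sigma(-T_{1,\sigma}, T_{1,\sigma}) = \sqrt{p_\sigma(-T_{2,\sigma}, -T_{1,\sigma})\, p_\sigma(T_{1,\sigma}, T_{2,\sigma})}$$

and 

$$\forall i = 1, \ldots, z-2 : \; p_\sigma(-T_{i+1,\sigma}, -T_{i,\sigma})\, p_\sigma(T_{i,\sigma}, T_{i+1,\sigma}) = p_\sigma(-T_{i+2,\sigma}, -T_{i+1,\sigma})\, p_\sigma(T_{i+1,\sigma}, T_{i+2,\sigma}).$$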

Proof: Due to the linearity of the goal function in 
$t_l, t_c, \underline{t}_1, \ldots, \underline{t}_{z-1}$, it assumes its minimum at one of the extremal 
points given by condition $C^*$, i.e. (12) reduces to 

$$-\ln(P_\sigma) \approx \min\left\{ g_\sigma\!\left(\tfrac{d}{2}, 0, 0, \ldots, 0, T_1, \ldots, T_z\right),\; g_\sigma(0, d, 0, \ldots, 0, T_1, \ldots, T_z),\; g_\sigma\!\left(0, 0, \tfrac{d}{2}, \ldots, 0, T_1, \ldots, T_z\right),\; \ldots,\; g_\sigma\!\left(0, 0, 0, \ldots, \tfrac{d}{2}, T_1, \ldots, T_z\right) \right\}. \qquad (14)$$

Let $\mathcal{T}_\sigma$ be the set of thresholds such that the value of the 
goal function is equal at all extremal points. Returning to the 
non-logarithmic representation, (14) becomes 

$$P_\sigma \approx \max\left\{ p_l^{d/2},\; p_c^{d},\; (\underline{p}_1 \overline{p}_1)^{d/2},\; \ldots,\; (\underline{p}_{z-1} \overline{p}_{z-1})^{d/2} \right\}. \qquad (15)$$

Let $\mathcal{T}'$ be a set of thresholds where at least one threshold 
is different than in $\mathcal{T}_\sigma$. Assume that $\mathcal{T}'$ is optimal. The only 
possible way for $\mathcal{T}'$ to decrease $P_\sigma$ would be to decrease 
all terms in (15) simultaneously. This is impossible since 
the probabilities necessarily sum up to one, hence $\mathcal{T}_\sigma$ is the 
optimal set of thresholds and the statement is proved. ■ 



V. Conclusions and Outlook 

In this paper we considered a special case of (reduced) 
GMD decoding, i.e. transmission over an AWGN channel, 
BPSK modulation and error/erasure BMD decoding of a 
binary code. Starting from the single threshold case where 
only one decoding trial is performed, we generalized our 
considerations to the $z > 1$ thresholds case. For both 
cases, we derived thresholds for erasing unreliable received 
symbols, that are optimal in terms of the achievable minimal 
decoding error probability. To simplify usage of our results 
in practical applications we gave the approximated analytic 
threshold location for the single threshold case. 

We showed that a gain of 1.4 dB over errors-only BMD 
decoding can be achieved with single-trial error/erasure de- 
coding. We did not address the error probability of GMD 
decoding with $z > 1$ thresholds in this paper. However, Forney 
showed that for good channel conditions the gain over 
errors-only decoding is approximately 3 dB if $z > (d-1)/2$, i.e. in 
case of full GMD decoding. 

Our work on the subject is continued with the goal to 
generalize the considerations from this paper to concatenated 
codes where the inner code is a binary block code and the 
outer code is a (potentially interleaved) Reed-Solomon code. 